Search | WHO COVID-19 Research Database

Distinguishing Admissions Specifically for COVID-19 From Incidental SARS-CoV-2 Admissions: National Retrospective Electronic Health Record Study.

Klann, Jeffrey G; Strasser, Zachary H; Hutch, Meghan R; Kennedy, Chris J; Marwaha, Jayson S; Morris, Michele; Samayamuthu, Malarkodi Jebathilagam; Pfaff, Ashley C; Estiri, Hossein; South, Andrew M; Weber, Griffin M; Yuan, William; Avillach, Paul; Wagholikar, Kavishwar B; Luo, Yuan; Omenn, Gilbert S; Visweswaran, Shyam; Holmes, John H; Xia, Zongqi; Brat, Gabriel A; Murphy, Shawn N.

J Med Internet Res ; 24(5): e37931, 2022 05 18.

Article in English | MEDLINE | ID: covidwho-1862520

ABSTRACT

BACKGROUND: Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. Electronic health record (EHR)-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization. Although the need to improve classification of COVID-19 versus incidental SARS-CoV-2 is well understood, the magnitude of the problem has only been characterized in small, single-center studies. Furthermore, there have been no peer-reviewed studies evaluating methods for improving classification. OBJECTIVE: The aims of this study are to, first, quantify the frequency of incidental hospitalizations over the first 15 months of the pandemic in multiple hospital systems in the United States and, second, to apply electronic phenotyping techniques to automatically improve COVID-19 hospitalization classification. METHODS: From a retrospective EHR-based cohort in 4 US health care systems in Massachusetts, Pennsylvania, and Illinois, a random sample of 1123 SARS-CoV-2 PCR-positive patients hospitalized from March 2020 to August 2021 was manually chart-reviewed and classified as "admitted with COVID-19" (incidental) versus specifically admitted for COVID-19 ("for COVID-19"). EHR-based phenotyping was used to find feature sets to filter out incidental admissions. RESULTS: EHR-based phenotyped feature sets filtered out incidental admissions, which occurred in an average of 26% of hospitalizations (although this varied widely over time, from 0% to 75%). The top site-specific feature sets had 79%-99% specificity with 62%-75% sensitivity, while the best-performing across-site feature sets had 71%-94% specificity with 69%-81% sensitivity.
CONCLUSIONS: A large proportion of SARS-CoV-2 PCR-positive admissions were incidental. Straightforward EHR-based phenotypes differentiated admissions, which is important to assure accurate public health reporting and research.
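
The sensitivity and specificity figures reported above can be illustrated with a minimal sketch. It assumes that a feature set flags an admission as "for COVID-19" when at least one of its features is present, and that chart review is the reference label. The feature names, the `Admission` record, and the toy data are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass, field


@dataclass
class Admission:
    """One SARS-CoV-2 PCR-positive admission (toy record, not study data)."""
    features: set = field(default_factory=set)  # hypothetical EHR feature names
    for_covid: bool = False                      # chart-review label


def is_flagged(admission, feature_set):
    """Assumed rule: flag as 'for COVID-19' if any feature in the set is present."""
    return bool(admission.features & feature_set)


def sensitivity_specificity(admissions, feature_set):
    """Compare the flag against chart review; 'for COVID-19' is the positive class."""
    tp = fp = fn = tn = 0
    for adm in admissions:
        flagged = is_flagged(adm, feature_set)
        if adm.for_covid and flagged:
            tp += 1
        elif adm.for_covid and not flagged:
            fn += 1
        elif not adm.for_covid and flagged:
            fp += 1
        else:
            tn += 1
    # Return None when a rate is undefined (no positives or no negatives).
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical feature set and chart-reviewed sample.
FEATURE_SET = {"remdesivir_order", "supplemental_oxygen"}
ADMISSIONS = [
    Admission({"remdesivir_order"}, for_covid=True),      # true positive
    Admission({"crp_lab"}, for_covid=True),               # false negative
    Admission({"supplemental_oxygen"}, for_covid=False),  # false positive
    Admission(set(), for_covid=False),                    # true negative
    Admission({"fracture_dx"}, for_covid=False),          # true negative
]

SENS, SPEC = sensitivity_specificity(ADMISSIONS, FEATURE_SET)
print(f"sensitivity={SENS:.2f} specificity={SPEC:.2f}")
```

In this framing, a high-specificity feature set keeps few incidental admissions in the "for COVID-19" group, at the cost of missing some true COVID-19 admissions when sensitivity is lower.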

Subject(s)

COVID-19 , SARS-CoV-2 , COVID-19/diagnosis , COVID-19/epidemiology , Electronic Health Records , Hospitalization , Humans , Retrospective Studies

An objective framework for evaluating unrecognized bias in medical AI models predicting COVID-19 outcomes.

Estiri, Hossein; Strasser, Zachary H; Rashidian, Sina; Klann, Jeffrey G; Wagholikar, Kavishwar B; McCoy, Thomas H; Murphy, Shawn N.

J Am Med Inform Assoc ; 29(8): 1334-1341, 2022 07 12.

Article in English | MEDLINE | ID: covidwho-1831208

ABSTRACT

OBJECTIVE: The increasing translation of artificial intelligence (AI)/machine learning (ML) models into clinical practice brings an increased risk of direct harm from modeling bias; however, bias remains incompletely measured in many medical AI applications. This article aims to provide a framework for objective evaluation of medical AI from multiple aspects, focusing on binary classification models. MATERIALS AND METHODS: Using data from over 56 000 Mass General Brigham (MGB) patients with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), we evaluate unrecognized bias in 4 AI models developed during the early months of the pandemic in Boston, Massachusetts that predict risks of hospital admission, ICU admission, mechanical ventilation, and death after a SARS-CoV-2 infection purely based on their pre-infection longitudinal medical records. Models were evaluated both retrospectively and prospectively using model-level metrics of discrimination, accuracy, and reliability, and a novel individual-level metric for error. RESULTS: We found inconsistent instances of model-level bias in the prediction models. From an individual-level aspect, however, we found nearly all models performing with slightly higher error rates for older patients. DISCUSSION: While a model can be biased against certain protected groups (ie, perform worse) in certain tasks, it can be at the same time biased towards another protected group (ie, perform better). As such, current bias evaluation studies may lack a full depiction of the variable effects of a model on its subpopulations. CONCLUSION: Only a holistic evaluation, a diligent search for unrecognized bias, can provide enough information for an unbiased judgment of AI bias that can invigorate follow-up investigations on identifying the underlying roots of bias and ultimately make a change.
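
One way to picture a model-level comparison of discrimination across subgroups is to compute the area under the ROC curve (AUROC) separately for each group. The sketch below is an illustration only: it uses the standard pairwise definition of AUROC, the group names and records are invented, and it does not reproduce the framework or the individual-level error metric described in the abstract.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined without both classes
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_group(records):
    """records: iterable of (group, true_label, predicted_risk) tuples."""
    labels = defaultdict(list)
    scores = defaultdict(list)
    for group, y, score in records:
        labels[group].append(y)
        scores[group].append(score)
    return {g: auroc(labels[g], scores[g]) for g in labels}


# Hypothetical predictions for two age groups (not study data).
RECORDS = [
    ("under_65", 1, 0.9), ("under_65", 0, 0.2),
    ("under_65", 1, 0.7), ("under_65", 0, 0.4),
    ("65_plus", 1, 0.6), ("65_plus", 0, 0.7),
    ("65_plus", 1, 0.8), ("65_plus", 0, 0.3),
]

BY_GROUP = auroc_by_group(RECORDS)
for group, value in BY_GROUP.items():
    print(group, value)
```

A gap between groups on one metric does not settle the question by itself, which is consistent with the abstract's point that a model can perform worse for one group on one task and better on another.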

Subject(s)

COVID-19 , Artificial Intelligence , Humans , Reproducibility of Results , Retrospective Studies , SARS-CoV-2

What Every Reader Should Know About Studies Using Electronic Health Record Data but May Be Afraid to Ask.

Kohane, Isaac S; Aronow, Bruce J; Avillach, Paul; Beaulieu-Jones, Brett K; Bellazzi, Riccardo; Bradford, Robert L; Brat, Gabriel A; Cannataro, Mario; Cimino, James J; García-Barrio, Noelia; Gehlenborg, Nils; Ghassemi, Marzyeh; Gutiérrez-Sacristán, Alba; Hanauer, David A; Holmes, John H; Hong, Chuan; Klann, Jeffrey G; Loh, Ne Hooi Will; Luo, Yuan; Mandl, Kenneth D; Daniar, Mohamad; Moore, Jason H; Murphy, Shawn N; Neuraz, Antoine; Ngiam, Kee Yuan; Omenn, Gilbert S; Palmer, Nathan; Patel, Lav P; Pedrera-Jiménez, Miguel; Sliz, Piotr; South, Andrew M; Tan, Amelia Li Min; Taylor, Deanne M; Taylor, Bradley W; Torti, Carlo; Vallejos, Andrew K; Wagholikar, Kavishwar B; Weber, Griffin M; Cai, Tianxi.

J Med Internet Res ; 23(3): e22219, 2021 03 02.

Article in English | MEDLINE | ID: covidwho-1088863

ABSTRACT

Coincident with the tsunami of COVID-19-related publications, there has been a surge of studies using real-world data, including those obtained from the electronic health record (EHR). Unfortunately, several of these high-profile publications were retracted because of concerns regarding the soundness and quality of the studies and the EHR data they purported to analyze. These retractions highlight that although a small community of EHR informatics experts can readily identify strengths and flaws in EHR-derived studies, many medical editorial teams and otherwise sophisticated medical readers lack the framework to fully critically appraise these studies. In addition, conventional statistical analyses cannot overcome the need for an understanding of the opportunities and limitations of EHR-derived studies. We distill here from the broader informatics literature six key considerations that are crucial for appraising studies utilizing EHR data: data completeness, data collection and handling (eg, transformation), data type (ie, codified, textual), robustness of methods against EHR variability (within and across institutions, countries, and time), transparency of data and analytic code, and the multidisciplinary approach. These considerations will inform researchers, clinicians, and other stakeholders as to the recommended best practices in reviewing manuscripts, grants, and other outputs from EHR-data derived studies, and thereby promote and foster rigor, quality, and reliability of this rapidly growing field.

Subject(s)

COVID-19/epidemiology , Data Collection/methods , Electronic Health Records , Data Collection/standards , Humans , Peer Review, Research/standards , Publishing/standards , Reproducibility of Results , SARS-CoV-2/isolation & purification

Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data.

Klann, Jeffrey G; Estiri, Hossein; Weber, Griffin M; Moal, Bertrand; Avillach, Paul; Hong, Chuan; Tan, Amelia L M; Beaulieu-Jones, Brett K; Castro, Victor; Maulhardt, Thomas; Geva, Alon; Malovini, Alberto; South, Andrew M; Visweswaran, Shyam; Morris, Michele; Samayamuthu, Malarkodi J; Omenn, Gilbert S; Ngiam, Kee Yuan; Mandl, Kenneth D; Boeker, Martin; Olson, Karen L; Mowery, Danielle L; Follett, Robert W; Hanauer, David A; Bellazzi, Riccardo; Moore, Jason H; Loh, Ne-Hooi Will; Bell, Douglas S; Wagholikar, Kavishwar B; Chiovato, Luca; Tibollo, Valentina; Rieg, Siegbert; Li, Anthony L L J; Jouhet, Vianney; Schriver, Emily; Xia, Zongqi; Hutch, Meghan; Luo, Yuan; Kohane, Isaac S; Brat, Gabriel A; Murphy, Shawn N.

J Am Med Inform Assoc ; 28(7): 1411-1420, 2021 07 14.

Article in English | MEDLINE | ID: covidwho-1075534

ABSTRACT

OBJECTIVE: The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity. MATERIALS AND METHODS: Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of 6 code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of intensive care unit (ICU) admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity with the 4CE phenotype at 1 site. RESULTS: The full 4CE severity phenotype had pooled sensitivity of 0.73 and specificity 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability, up to 0.65 across sites. At one pilot site, the expert-derived phenotype had mean area under the curve of 0.903 (95% confidence interval, 0.886-0.921), compared with an area under the curve of 0.956 (95% confidence interval, 0.952-0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with as low as 49% precision and recall compared with chart review. DISCUSSION: We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly owing to heterogeneous pandemic conditions. CONCLUSIONS: We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
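
The abstract does not say how the pooled sensitivity and specificity were computed. As a rough sketch, one simple option is to sum each site's confusion-matrix counts and compute the rates on the totals; this is an assumption, not the method confirmed by the source, and the `SiteCounts` type and the counts below are hypothetical.

```python
from typing import NamedTuple


class SiteCounts(NamedTuple):
    """Confusion-matrix counts for one site (phenotype vs ICU admission and/or death)."""
    tp: int
    fp: int
    fn: int
    tn: int


def pooled_sensitivity_specificity(sites):
    """Pool by summing counts across sites, then compute the two rates once."""
    tp = sum(s.tp for s in sites)
    fp = sum(s.fp for s in sites)
    fn = sum(s.fn for s in sites)
    tn = sum(s.tn for s in sites)
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical counts for two sites (not 4CE data).
SITES = [
    SiteCounts(tp=8, fp=2, fn=2, tn=8),
    SiteCounts(tp=3, fp=1, fn=1, tn=15),
]

POOLED_SENS, POOLED_SPEC = pooled_sensitivity_specificity(SITES)
print(f"pooled sensitivity={POOLED_SENS:.3f} specificity={POOLED_SPEC:.3f}")
```

Summing counts weights each site by its number of patients; other pooling schemes, such as averaging per-site rates or fitting a meta-analytic model, can give different values.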

Subject(s)

COVID-19 , Electronic Health Records , Severity of Illness Index , COVID-19/classification , Hospitalization , Humans , Machine Learning , Prognosis , ROC Curve , Sensitivity and Specificity

Predicting COVID-19 mortality with electronic medical records.

Estiri, Hossein; Strasser, Zachary H; Klann, Jeffy G; Naseri, Pourandokht; Wagholikar, Kavishwar B; Murphy, Shawn N.

NPJ Digit Med ; 4(1): 15, 2021 Feb 04.

Article in English | MEDLINE | ID: covidwho-1065966

ABSTRACT

This study aims to predict death after COVID-19 using only the past medical information routinely collected in electronic health records (EHRs) and to understand the differences in risk factors across age groups. Combining computational methods and clinical expertise, we curated clusters that represent 46 clinical conditions as potential risk factors for death after a COVID-19 infection. We trained age-stratified generalized linear models (GLMs) with component-wise gradient boosting to predict the probability of death based on what we know from the patients before they contracted the virus. Despite only relying on previously documented demographics and comorbidities, our models demonstrated similar performance to other prognostic models that require an assortment of symptoms, laboratory values, and images at the time of diagnosis or during the course of the illness. In general, we found age as the most important predictor of mortality in COVID-19 patients. A history of pneumonia, which is rarely asked about in typical epidemiology studies, was one of the most important risk factors for predicting COVID-19 mortality. A history of diabetes with complications and cancer (breast and prostate) were notable risk factors for patients between the ages of 45 and 65 years. In patients aged 65-85 years, diseases that affect the pulmonary system, including interstitial lung disease, chronic obstructive pulmonary disease, lung cancer, and a smoking history, were important for predicting mortality. The ability to compute precise individual-level risk scores exclusively based on the EHR is crucial for effectively allocating and distributing resources, such as prioritizing vaccination among the general population.
